Research on swimming crab temperature and salinity index insurance based on the Copula function

Under climate change, sea surface temperature and salinity change greatly, which poses a considerable threat to sustainable food security. Sea surface temperature (SST) and sea surface salinity (SSS) are selected to examine the annual yield of swimming crab in 24 cities along the eastern coast of China. A Copula function is used to construct the joint probability distribution model of swimming crab yield with SST and SSS, and the pure premium rate of swimming crab production in these 24 cities is then determined. The results show that 1) there are significant positive correlations between the yield of swimming crab and both temperature and salinity over the study area; the only exception is that the correlation between yield and salinity is not significant in the south of the study area. 2) The span of the pure premium rate of swimming crab in the 24 cities increases rapidly with the protection level, with a maximum span of 2.04% and a minimum span of only 1.6%. 3) The swimming crab premium rate varies in space. The premium rate of the 8 cities south of Taizhou is low, with the highest rate at 5.6%. The premium rate of the 16 cities north of Taizhou is relatively high, between 6% and 22%. The research can provide a theoretical basis for pricing insurance products for swimming crab in the 24 cities of the typical aquaculture areas in eastern China.


Introduction
Under climate change, extreme weather events are likely to become more frequent and intense, which has a great impact on mariculture [1]. The direct economic loss caused by meteorological disasters to China's mariculture industry was as high as 15.637 billion yuan in 2019. Insurance products play a particularly prominent role in compensating for aquaculture losses [2,3]. The swimming crab, nutritious and delicious, is an important aquaculture species in China and produces large economic benefits. It was listed as a key aquaculture product in China in 1981 [4]. The yield of swimming crab is affected by natural factors, breeding technology, labor, and other inputs. Sea surface temperature and salinity (SST/SSS) are the most important environmental factors affecting the growth and development of swimming crab, including the survival rate, production, breeding cost, and pests and diseases. The resulting losses are often high and unbearable for smallholders. Therefore, we choose SST and SSS as variables to determine swimming crab premium rates, which can theoretically improve the accuracy of the rate results. Investigating swimming crab index insurance therefore has practical significance. Index insurance was proposed as early as 1999 to link the uncertainty of crop yield with meteorological disasters [5]. The first application of index insurance in the Chinese aquaculture industry was the Neitang crab hydrological index insurance product, developed by the People's Insurance Company of China (PICC) Jiangsu Branch in 2012 [6]. Current studies on index insurance can be grouped into the following aspects: 1) In terms of spatial scale, the relevant research mainly selects national, provincial and municipal regions as the research area.
Some studies determine the insurance rate by establishing the relationship between the production of aquaculture products and natural factors, and some compare the differences of index insurance products across study areas [7][8][9][10]. The existing literature lacks a dedicated study of insurance rates for a single aquacultural product across the entire eastern coastal area of China. 2) In terms of data source, index insurance research is mainly designed on observed wind [11], temperature [12], precipitation [13,14], sunshine duration [15] and other data monitored by meteorological stations. Previous theoretical research is continuously being applied in practice. Current studies mainly investigate the relationship between the farmed product and single factors. However, the growth of aquaculture products is jointly affected by multiple meteorological and hydrological factors, and changes in SST and SSS directly affect the yield of aquaculture products. In view of this, SST and SSS can be applied as influencing factors to design multi-factor index insurance products. 3) In terms of research method, the analytical approaches of aquatic insurance fall into two categories: empirical methods and statistical analysis methods. The former include the empirical rate method and the loss ratio method. The latter can estimate the linear relationship between influencing factors and yield by trend fitting analysis. Furthermore, several scholars simulate the occurrence probability of meteorological disasters with a Weibull function model. They also establish a univariate nonlinear regression model between influencing factors and yield through the Copula function to design insurance products with different meteorological indices [16][17][18], such as a high-temperature index [19], a precipitation index, and a continuous cloudy and rainy index [20].
Over the past few decades, researchers and practitioners alike have contributed to an ongoing debate about the feasibility of weather index insurance and its merits as a tool for agricultural risk management. Most insurance products are designed by constructing a linear relationship between production and environmental factors, but the relationship is not linear in every region. Giacomo indicated that the Copula function can be used to measure the nonlinear relationship between variables and to calculate the corresponding pure insurance rate [21]. Currently, the Copula function is usually applied in the design of index insurance products for planted crops [22], but there are few applications in aquaculture insurance. Therefore, we attempt to use the Copula function to determine the premium rate of aquaculture products. In this study, a combination of nonparametric kernel density estimation and the Copula function is used to construct the probability distribution model of swimming crab yield with SST and SSS data in typical aquaculture areas in eastern China from 2001 to 2019.

Study area
Typical aquaculture areas in eastern China comprise 24 coastal cities from Qinhuangdao to Zhangzhou (Fig 1). The swimming crab is cultivated at water depths of 0 to 30 meters. SST and SSS are easily affected by solar radiation, weather conditions and internal waves. Eastern China is affected by the East Asian monsoon, with large spatial and temporal variability. There are two stages of artificial breeding of swimming crabs: seedling raising (February to June) and crab breeding (June to February). The existing artificial seedling technology is quite mature. The breeding period of commercial crab can be subdivided into the growth period (June to November) and the breeding period (November to February). Swimming crab is a eurythermal and euryhaline species. It can survive and adapt to water with SST between 8˚C and 31˚C (the optimum SST is 15.5˚C-26˚C) and SSS of 16‰-35‰ (the optimum SSS is 26‰-32‰). If SST and SSS pass certain threshold levels (SST<15˚C and SSS>32‰), the two factors can greatly affect biological processes of swimming crab such as growth, reproduction, feeding, molting and output of crab roe [23]. Thus, the zone extending 100 km eastward from the coastline of China is taken as the study area. The extreme values of SST and SSS at three-day resolution from 2001 to 2019 are extracted to investigate the effects of low SST and high SSS on yield in the two stages of the growth period and the breeding period of commercial crab.

Data
The annual yield data of swimming crabs from 2001 to 2019 were collected from the Municipal Statistical Yearbooks of the 24 cities. We used the Simple Ocean Data Assimilation (SODA) reanalysis data sets, jointly developed by the University of Maryland and Texas A&M University. The SODA reanalysis is a global ocean reanalysis created by assimilating observational data into an ocean general circulation model based on the POP model. It provides three-day averaged gridded variables, sea surface salinity (SSS) (http://apdrc.soest.hawaii.edu/erddap/griddap/hawaii_soest_face_930e_5a79.html) and sea surface temperature (SST) (http://apdrc.soest.hawaii.edu/erddap/griddap/hawaii_soest_1b16_9fed_896e.html), with a horizontal resolution of 0.1˚×0.1˚. The temporal and spatial resolutions are relatively coarse, but the reanalysis data can still be used to evaluate the relationship between yield and the variables, and they resolve most mesoscale eddies in our study regions. The materials not included in this study are available on the OSF.

Statistical model of yield factor.
The actual yield of swimming crab is affected by breeding technology, management level and natural disasters during its growth period. Therefore, it is necessary to distinguish the impact of natural and non-natural factors on yield when analyzing aquaculture insurance. Generally, the actual yield can be divided into trend yield, meteorological yield and random yield [24]. Because the selected time series is short, the HP (Hodrick-Prescott) filter model can better separate trend yield and meteorological yield (the difference between actual yield and trend yield). The HP filter assumes that the yield sequence $y_t$ is composed of a trend term $g_t$ and a fluctuation term $c_t$. The trend term is determined by minimizing the loss function:
$$\min_{\{g_t\}} \left\{ \sum_{t=1}^{n} (y_t - g_t)^2 + \lambda \sum_{t=2}^{n-1} \left[ (g_{t+1} - g_t) - (g_t - g_{t-1}) \right]^2 \right\} \quad (1)$$
where $n$ is the number of samples and $\lambda$ is the smoothing parameter. According to experience, when annual data are used, the best fitting effect is obtained with $\lambda = 100$. Combining $y_t = g_t + c_t$ and Eq (1), the trend term is obtained as:
$$\mathbf{g} = (\mathbf{I} + \lambda \mathbf{D}^{\top}\mathbf{D})^{-1}\,\mathbf{y}$$
where $\mathbf{y}$ and $\mathbf{g}$ collect $y_t$ and $g_t$ for $t = 1, \ldots, n$, $\mathbf{I}$ is the identity matrix, and $\mathbf{D}$ is the $(n-2) \times n$ second-difference matrix. The HP filter has been widely applied in the study of yield fluctuation. The yield is then converted into detrended yield taking 2019 as the base period:
$$\tilde{Y}_t = Y_t \, \frac{\hat{Y}_{2019}}{\hat{Y}_t}$$
where $t = 2001, \ldots, 2019$; $\tilde{Y}_t$ is the detrended yield; $Y_t$ is the original yield sequence; and $\hat{Y}_t$ is the trend yield in year $t$.
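The decomposition described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the standard closed-form HP solution (solving (I + λDᵀD)g = y with D the second-difference matrix) and the base-year rescaling of each observation by the ratio of base-year trend to same-year trend. The function names `hp_trend` and `detrend_to_base` and the sample yields are hypothetical.

```python
def hp_trend(y, lam=100.0):
    """HP-filter trend: minimizes sum (y-g)^2 + lam * sum (second differences of g)^2."""
    n = len(y)
    if n < 3:
        return [float(v) for v in y]
    # Build A = I + lam * D^T D, with D the (n-2) x n second-difference matrix.
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 1.0
    for r in range(n - 2):
        coeffs = {r: 1.0, r + 1: -2.0, r + 2: 1.0}
        for i, ci in coeffs.items():
            for j, cj in coeffs.items():
                A[i][j] += lam * ci * cj
    # Solve A g = y by Gaussian elimination with partial pivoting.
    b = [float(v) for v in y]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            if f != 0.0:
                for c in range(col, n):
                    A[r][c] -= f * A[col][c]
                b[r] -= f * b[col]
    # Back substitution.
    g = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = b[i] - sum(A[i][j] * g[j] for j in range(i + 1, n))
        g[i] = s / A[i][i]
    return g


def detrend_to_base(y, trend, base_index=-1):
    """Rescale each observation to the trend level of the base year (last year by default)."""
    base = trend[base_index]
    return [yt * base / gt for yt, gt in zip(y, trend)]


if __name__ == "__main__":
    # Hypothetical annual yields, for illustration only.
    yields = [10.0, 12.0, 9.0, 14.0, 13.0, 17.0, 15.0]
    trend = hp_trend(yields, lam=100.0)
    print(detrend_to_base(yields, trend))
```

A dense solve is used here only because the series are short (19 annual values per city); for long series a banded solver would be the more natural choice.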

Nonparametric kernel density estimation.
Unlike parametric methods, nonparametric kernel density estimation does not need to make definite assumptions about the distribution of the data. It relies on the characteristics of the data themselves to obtain the distribution. The fit of the probability density is governed by the kernel function and the bandwidth. The method is widely applicable and makes the fitted probability density function (PDF) closer to the real information.
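Taking the yield, SST and SSS data in turn to describe the variability of each variable, the probability density function $f(x)$ can be expressed as [25]:
$$\hat{f}(x) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right)$$
where $K(\cdot)$ is the kernel function; $n$ is the sample size; $h$ is the window width (bandwidth); and $x_i$ are the observations. A kernel function must be chosen to calculate the premium rate of swimming crab. The expression of the Gaussian kernel is:
$$K(x) = \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{x^2}{2}\right)$$
According to Silverman's rule of thumb, the optimal bandwidth of the Gaussian kernel function is:
$$h = 1.06\,\sigma\,n^{-1/5}$$
where $\sigma$ is the standard deviation of the sample.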
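As an illustration of the estimator described above, the following minimal sketch evaluates a Gaussian kernel density and its distribution function with a rule-of-thumb bandwidth. It assumes the common form of Silverman's rule, h = 1.06·σ·n^(-1/5) with σ the sample standard deviation; the function names and the sample values are hypothetical and not taken from the study.

```python
import math
import statistics


def silverman_bandwidth(sample):
    """Rule-of-thumb bandwidth for a Gaussian kernel (assumed form: 1.06 * sigma * n^(-1/5))."""
    n = len(sample)
    sigma = statistics.stdev(sample)  # sample standard deviation
    return 1.06 * sigma * n ** (-1 / 5)


def gaussian_kde_pdf(x, sample, h=None):
    """Kernel density estimate f(x) = 1/(n h) * sum K((x - x_i) / h) with a Gaussian K."""
    if h is None:
        h = silverman_bandwidth(sample)
    total = sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample)
    return total / (len(sample) * h * math.sqrt(2 * math.pi))


def gaussian_kde_cdf(x, sample, h=None):
    """Marginal distribution function implied by the Gaussian kernel estimate."""
    if h is None:
        h = silverman_bandwidth(sample)
    terms = (0.5 * (1 + math.erf((x - xi) / (h * math.sqrt(2)))) for xi in sample)
    return sum(terms) / len(sample)


if __name__ == "__main__":
    # Hypothetical normalized yields, for illustration only.
    data = [0.1, 0.4, 0.35, 0.8, 0.55]
    print(silverman_bandwidth(data), gaussian_kde_pdf(0.5, data), gaussian_kde_cdf(0.5, data))
```

The distribution function is included because the copula step works on the probability-integral transforms of the margins rather than on the densities themselves.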

Construction joint probability model for yield and SST or SSS.
The Copula concept was proposed by Sklar in 1959 as a function that combines one-dimensional marginal distribution functions to form a multivariate distribution function of two or more random variables [26][27][28]. Sklar's theorem states: if $H$ is the joint distribution function of a bivariate random vector $(X, Y)$ with continuous marginal distribution functions $F$ and $G$, then there is a Copula function $C(\cdot)$ such that:
$$H(x, y) = C(F(x), G(y)), \qquad C(u, v) = H\!\left(F^{-1}(u), G^{-1}(v)\right)$$
where $X$ and $Y$ are the yield and the SST or SSS factor, respectively; $F^{-1}$ and $G^{-1}$ denote the generalized inverses of $F$ and $G$, that is, $F^{-1}(u) = \sup\{z \mid F(z) \le u\}$ and $G^{-1}(v) = \sup\{z \mid G(z) \le v\}$; $F$ and $G$ are the marginal distribution functions of yield and of the SST or SSS factor, respectively; and $C$ is the joint distribution of the margins on $[0,1]$. The density function of the Copula function is expressed as:
$$c(u, v) = \frac{\partial^2 C(u, v)}{\partial u \, \partial v}$$
If the derivative exists, the joint density of $X$ and $Y$ is given by $f(x, y) = c(F(x), G(y))\, f(x)\, g(y)$, where $f$ and $g$ are the probability density functions of $F$ and $G$. In insurance finance, the Copula function has been applied to pricing crop income insurance. The nonlinear dependence between variables can be calculated by constructing the joint distribution of the variables. Four common types of Copula functions are selected in this paper: Normal Copula, Gumbel Copula, Frank Copula, and Clayton Copula. The structure of the Archimedean Copula functions (Gumbel, Frank and Clayton) is uniformly expressed as:
$$C(u, v) = \varphi^{-1}\!\left(\varphi(u) + \varphi(v)\right)$$
where $\varphi: [0,1] \to [0,\infty]$ is the generator of the Copula function, a continuous strictly decreasing function such that $\varphi(1) = 0$; $\varphi^{-1}$ denotes the inverse of $\varphi$ and is a continuous and non-increasing function. The generators and equations of the Copula functions are shown in Table 1.
The parameters of the joint distributions of yield with SST and of yield with SSS are estimated separately by the maximum likelihood method. MATLAB software is used to calculate the squared Euclidean distance between each of the four Copula functions and the empirical Copula function. The optimal Copula is the one with the smallest squared Euclidean distance from the empirical Copula. The Copula function carries the correlation information between yield and SSS and between yield and SST, which is expressed by the Kendall-τ correlation coefficient.
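The selection rule described above (smallest squared Euclidean distance from the empirical Copula) can be sketched as follows. This is an illustrative Python sketch rather than the MATLAB workflow used in the study. It assumes rank-based pseudo-observations, takes candidate parameters as given instead of estimating them by maximum likelihood, and covers only the Clayton and Gumbel families because the Normal Copula distribution function needs a bivariate normal integral that the standard library does not provide. All function names and sample values are hypothetical.

```python
import math


def pseudo_obs(xs):
    """Map a sample to (0, 1) by rank / (n + 1); ties are not treated specially."""
    n = len(xs)
    order = sorted(range(n), key=lambda i: xs[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    return u


def empirical_copula(u, v):
    """Empirical copula evaluated at each observed pair (u_i, v_i)."""
    n = len(u)
    return [sum(1 for j in range(n) if u[j] <= u[i] and v[j] <= v[i]) / n
            for i in range(n)]


def clayton_cdf(u, v, theta):
    """Clayton copula, theta > 0."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def gumbel_cdf(u, v, theta):
    """Gumbel copula, theta >= 1 (theta = 1 gives independence)."""
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-s ** (1.0 / theta))


def squared_distance(cdf, theta, u, v):
    """Squared Euclidean distance between a candidate copula and the empirical copula."""
    emp = empirical_copula(u, v)
    return sum((cdf(ui, vi, theta) - e) ** 2 for ui, vi, e in zip(u, v, emp))


def select_copula(candidates, u, v):
    """candidates maps name -> (cdf, theta); returns (best_name, all_distances)."""
    d = {name: squared_distance(cdf, th, u, v) for name, (cdf, th) in candidates.items()}
    return min(d, key=d.get), d


if __name__ == "__main__":
    # Hypothetical yield and SST series, for illustration only.
    u = pseudo_obs([0.8, 1.1, 0.95, 1.2, 0.7, 1.05, 0.9])
    v = pseudo_obs([14.0, 17.5, 16.0, 18.2, 13.1, 17.0, 15.4])
    print(select_copula({"clayton": (clayton_cdf, 2.0), "gumbel": (gumbel_cdf, 2.0)}, u, v))
```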

Pure premium rate.
A Monte Carlo simulation model is used to determine premium rates separately for yield and SSS and for yield and SST. First, the Copula function selected and parameterized in the previous step is sampled 10,000 times by Monte Carlo simulation, generating random sequences u and v that follow the uniform distribution on [0,1]. Second, the inverse function of each marginal distribution is calculated by spline interpolation. Finally, the yield, SST and SSS data are recovered by applying the inverse marginal distributions, the newly generated yield is taken as the yield sample under the corresponding SST or SSS conditions, and it is substituted into the following premium calculation formula to determine the pure premium rate:
$$\mathrm{Loss} = \max(\alpha \bar{y} - y,\ 0)$$
$$R = \frac{E[\mathrm{Loss}]}{\alpha \bar{y}} = \frac{E[\max(\alpha \bar{y} - y,\ 0)]}{\alpha \bar{y}}$$
where $R$ is the pure premium rate; $y$ is the simulated yield; $\alpha$ is the coverage level, $\alpha \in [0,1]$; and $\bar{y}$ is the expected yield level, for which the long-term average yield of swimming crabs is substituted.
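The three simulation steps can be sketched in Python as follows. This is a simplified sketch under several assumptions, not the procedure as implemented in the study: it samples a Clayton Copula by conditional inversion (the study selects the best of four families per city), it inverts the margins by linear interpolation of the sorted observations instead of a spline over the kernel-estimated distribution, and it takes the pure premium rate as the expected shortfall below the guaranteed yield divided by the guaranteed yield. All names, the value of theta, and the sample data are hypothetical.

```python
import random


def sample_clayton(theta, n, rng):
    """Draw n (u, v) pairs from a Clayton copula (theta > 0) by conditional inversion."""
    pairs = []
    for _ in range(n):
        u = 1.0 - rng.random()  # uniform on (0, 1]
        w = 1.0 - rng.random()
        v = (u ** -theta * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
        pairs.append((u, v))
    return pairs


def quantile(sorted_sample, p):
    """Linear-interpolation inverse of the empirical CDF (stand-in for the spline inverse)."""
    pos = p * (len(sorted_sample) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_sample) - 1)
    frac = pos - lo
    return sorted_sample[lo] * (1 - frac) + sorted_sample[hi] * frac


def pure_premium_rate(yields, alpha, expected_yield=None):
    """Expected shortfall below the guarantee alpha * ybar, divided by the guarantee."""
    ybar = sum(yields) / len(yields) if expected_yield is None else expected_yield
    guarantee = alpha * ybar
    expected_loss = sum(max(guarantee - y, 0.0) for y in yields) / len(yields)
    return expected_loss / guarantee


def simulate_premium_rates(obs_yield, obs_index, theta, alphas, n_sim=10000, seed=0):
    """Simulate (yield, index) pairs from the copula and price each coverage level."""
    rng = random.Random(seed)
    ys, xs = sorted(obs_yield), sorted(obs_index)
    ybar = sum(obs_yield) / len(obs_yield)  # long-term average yield
    sims = [(quantile(ys, u), quantile(xs, v)) for u, v in sample_clayton(theta, n_sim, rng)]
    sim_yield = [y for y, _ in sims]
    return {a: pure_premium_rate(sim_yield, a, ybar) for a in alphas}


if __name__ == "__main__":
    # Hypothetical detrended yields and SST values, for illustration only.
    obs_y = [0.8, 1.1, 0.95, 1.2, 0.7, 1.05, 0.9]
    obs_x = [14.0, 17.5, 16.0, 18.2, 13.1, 17.0, 15.4]
    levels = [0.70, 0.75, 0.80, 0.85, 0.90, 0.95, 1.00]
    print(simulate_premium_rates(obs_y, obs_x, theta=2.0, alphas=levels))
```

Because the rate equals the expected value of max(1 - y/(α·ȳ), 0), it cannot decrease as the coverage level α rises, which matches the pattern of rates increasing with coverage reported in the results.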

Data preprocessing
A stationarity test is applied to the original data, and the results show that the original yield, SST and SSS series of most cities are non-stationary. The non-stationary series are therefore detrended to ensure the accuracy of the simulation results: the original series of yield, SST and SSS are decomposed by the HP filter model, and the yield is converted into yield data with 2019 as the base period. The ADF stationarity test is then performed again on the detrended yield, SSS and SST series of the 24 cities (Table 2). Because the yield, SSS and SST data have different dimensions, they are normalized to make them dimensionless.
There are significant differences in the skewness and kurtosis of detrended yield among the 24 cities (Fig 2). The kurtosis curve of yield shows a fluctuating upward trend with decreasing latitude (Fig 2A). The yield distribution mostly has "low peaks and thin tails" in the cities north of Yancheng and "sharp peaks and thick tails" in the cities south of Yancheng. The skewness and kurtosis of SSS and SST show similar spatial variation trends in the study area. The skewness and kurtosis of SSS show a decreasing trend (Fig 2B), while the skewness and kurtosis of SST show an increasing trend (Fig 2C). The kurtosis values of SST and SSS are less than 3 in all cities except Yancheng, Taizhou, Ningde, and Quanzhou, indicating that the SST and SSS series have a "low peak and thin tail" shape on the whole. The morphology of the data series is compared with the fitted kernel density function plots to test whether the nonparametric estimation method fits well.
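The shape descriptors used above can be computed directly from central moments. The sketch below assumes moment-based skewness and non-excess kurtosis (so that a normal distribution has kurtosis 3, consistent with the threshold of 3 used in the text); the function names and the labelling helper are hypothetical.

```python
def central_moment(xs, k):
    """k-th central moment with the population (1/n) normalization."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** k for x in xs) / len(xs)


def skewness(xs):
    """Moment-based skewness: m3 / m2^(3/2)."""
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5


def kurtosis(xs):
    """Non-excess kurtosis: m4 / m2^2 (equals 3 for a normal distribution)."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2


def tail_shape(xs):
    """Label a series by comparing its kurtosis with the normal benchmark of 3."""
    return "sharp peak, thick tails" if kurtosis(xs) > 3 else "low peak, thin tails"
```

For example, the evenly spaced series 1, 2, 3, 4, 5 has zero skewness and kurtosis 1.7, so it would be labelled as having a low peak and thin tails.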

Determine the marginal distribution.
Both parametric estimation and nonparametric kernel density estimation are used to simulate the probability distributions of detrended yield, SST and SSS, and the nonparametric results are more accurate than those of parametric estimation. In this study, four kernel functions (Uniform, Triangular, Epanechnikov and Gaussian) are used to fit the marginal distributions of the variables. Taking Ningbo as an example, the kernel density estimation results are analyzed. According to Silverman's rule of thumb, the window widths of yield, SST and SSS of swimming crab in Ningbo are calculated as 0.084, 0.173 and 0.195, respectively. The Gaussian kernel function estimates the distributions of yield, SST and SSS accurately (Fig 3), which is consistent with the results of the descriptive statistical analysis. Similarly, the Gaussian kernel function also gives the best fit in the other cities. Therefore, the premium rate of swimming crabs is calculated from the marginal distribution functions fitted by the Gaussian kernel function.

Copula parameter estimation.
The Kendall-τ correlation coefficient is used to measure the correlation between yield and its influencing factors. The Kendall-τ coefficient of the optimal Copula function between yield and SSS is negative in 7 cities such as Cangzhou (Table 3), which indicates that a large increase in SSS may lead to yield reduction. In contrast, the Kendall-τ coefficient of 9 cities such as Qinhuangdao is greater than 0, showing a trend of increasing production, while that of 8 cities such as Dongying is close to 0. This indicates that the relationship between yield and SSS varies in space. We also examine the correlation between SST and yield and find a weak negative correlation only in Ningbo, Wenzhou and Quanzhou; all other cities show positive correlations. This suggests that a decrease in SST leads to a reduction in yield.
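For reference, Kendall's τ can be computed by counting concordant and discordant pairs, and for some Copula families it maps to the dependence parameter in closed form. The sketch below assumes the tau-a variant (no tie correction) and uses the standard relations τ = θ/(θ+2) for the Clayton Copula and τ = 1 - 1/θ for the Gumbel Copula; it is an illustration only, since the study estimates the Copula parameters by maximum likelihood. Function names are hypothetical.

```python
def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs; ties count as zero."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            if prod > 0:
                s += 1
            elif prod < 0:
                s -= 1
    return s / (n * (n - 1) / 2)


def clayton_theta_from_tau(tau):
    """Invert tau = theta / (theta + 2); valid for 0 < tau < 1."""
    return 2.0 * tau / (1.0 - tau)


def gumbel_theta_from_tau(tau):
    """Invert tau = 1 - 1 / theta; valid for 0 <= tau < 1."""
    return 1.0 / (1.0 - tau)
```

A positive τ between yield and SST means that years with lower SST tend to be years with lower yield, which is the dependence the index insurance is meant to price.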

Index insurance premium rates
The pure premium rate can be calculated using Eqs (6) and (7) for a given level of coverage (from 70% to 100% at intervals of 5%). The results are shown in Table 4. The pure premium rate increases rapidly with the level of coverage. Within the 70%-100% range of coverage levels, the premium rates of SST and SSS index insurance in the same city fluctuate within 10%. The higher the coverage level, the larger the increase in the variation of the pure premium rate. In particular, when the coverage level is 100%, the increase span of the pure premium rate is between 1.6% and 2.04%. The pure premium rates of swimming crabs vary greatly among cities. The premium rates of Taizhou and the 8 cities south of Taizhou are lower than those of the 16 cities north of Taizhou. When the coverage level is 100%, the premium rate is between 6% and 22% for most cities north of Taizhou and less than 5.6% for cities south of Taizhou. There is no expected loss when the coverage level is lower than 85% for cities south of Taizhou. We also examine the premium rates by province and find that the premium rates of swimming crabs are lower in southern Zhejiang and Fujian, higher in Jiangsu, and much higher in Hebei, Shandong, and northern Zhejiang. The rates obtained in this study are similar to the rates from previous studies determined by other methods (Table 5). The correlation coefficient between the rate from the parametric distribution fitting method and the rate based on the coupling of the Copula function and nonparametric kernel density estimation is 0.957, and the correlation coefficient with the kernel density estimation results of other scholars is 0.946; both pass the 0.05 significance test. The method used in this study can therefore also serve as a rate determination method.
Since the 20% relative deductible rate specified in actual insurance products is not considered in this paper, the rate results are higher than those of other scholars. In addition, comparing the rate of the swimming crab precipitation index insurance product actually operating in Xiangshan County with the pure premium rate of the Ningbo swimming crab SST and SSS index insurance calculated in this paper shows that the two rates are similar. Therefore, using the Copula function to combine production data with SST and SSS data to determine the pure premium rate has practical significance.

Discussion
There is little literature on determining the pure premium rate of swimming crab in the typical aquacultural areas covered by China's aquaculture insurance pilots. The rationality of the pure premium rates of swimming crabs in the study area, calculated with nonparametric kernel density estimation and the Copula function, can be examined from several angles:

Discussion based on the methods
Currently, the empirical rate method is the most commonly used method to determine the premium rate of aquaculture [31]. It approximates the premium rate by the average loss rate over past years. The nonparametric kernel density estimation method takes the actual recorded data as a sample and does not need to assume a specific distribution form to estimate the loss probability, so its results are more accurate and reliable. In theory, the rate calculated by the nonparametric kernel density estimation method can be regarded as an actuarially fair rate. We compare the empirical rate with the pure premium rates determined by several other methods and find that the empirical rate is lower than the actuarially fair rates, which also explains the high rate of mariculture insurance projects [32].

Discussion based on the large spatial variability
The provinces with higher pure premium rates in the study area are Hebei, Shandong and Zhejiang, while the province with lower pure premium rates is Fujian. These features, based on the pure premium rates calculated by the Copula function, are similar to the results of the risk zoning of crustaceans in Chinese aquaculture by Gao et al. [32]. However, comparing pure premium rates with the city as the unit shows that the pure premium rates of swimming crabs vary among cities. Significant differences between cities in cultured area and in the frequency and intensity of disasters lead to differences in the risk of loss between cities. Thus, the rates at the city scale are higher than those at the provincial scale, and the pure premium rates of different cities range between 4% and 22% [33]. The current rate standards adopted by insurance institutions are therefore prone to high compensation and high expenses. In other words, applying the same rate on a large scale may cause unfairness [34]. It is suggested that insurance institutions and local governments actively promote risk assessment in coastal county-level areas to lay the foundation for more accurate premium rates.

Discussion based on research data
The premium rates of aquaculture insurance products in existing research are as high as 8% or even higher [30], but there are large differences in the premium rates of swimming crabs among regions. Since there is no standard production record, insurance companies may only be able to obtain sample data on the losses of mariculture households in the last 3-5 years to calculate premium rates. Therefore, even the same method yields different pure rates for different sample sizes [35]. Short time series are not enough to reflect the actual risk, while long time series are susceptible to changes in production risk over time, so it is more appropriate to take a sample time series of fewer than 20 years when determining rates.
There are also significant differences in the premium rates of different species. The premium rate of shrimp and scallop can be as high as 21%, and the premium rate of fish is even higher. The reason may be related to the adaptability of cultured species to environmental changes. In addition, the scale of aquaculture also has an impact on insurance rates, because the area at risk in the event of a disaster becomes larger as the scale of aquaculture increases. Therefore, the government should adopt different subsidy policies to deal with the large differences in the pure premium rates of swimming crabs among regions. It is necessary to develop products that meet the needs of the local market according to local conditions. This would make government work more scientific and fair.
The outlook for index insurance, based on the current development of domestic aquaculture index insurance, is summarized as follows. Local governments should strengthen cooperation with enterprises and universities, establish a comprehensive marine aquaculture disaster database and an insurance compensation database, and enrich city- and county-level data, because different cities and counties face different degrees of aquaculture risk. It is difficult for aquaculture households to obtain insurance compensation data, and information acquisition is not timely. Designing more scientific and reasonable rates for insurance products will help to promote the stable development of aquaculture. Moreover, it is necessary to strengthen the cross-integration of multidisciplinary knowledge and the innovative application of digital technology. Satellite remote sensing, geographic information, artificial intelligence processing technology, and monitored marine environmental data can be combined to improve the accuracy of aquatic insurance assessment and to design more sophisticated insurance products. At the same time, it is also important to enrich the methods of inspection and damage determination and to reduce the investment of human and material resources.
Many natural factors affect the yield of swimming crabs, and considering only the effects of SST and SSS is not comprehensive enough. In future research, we should consider a wider variety of influencing factors and extend the time series of swimming crab yield, SST and SSS data to design corresponding index insurance and improve the accuracy of swimming crab index insurance rates.

Conclusions
Quantifying the relationship between influencing factors and aquatic yield is critical to the design of index insurance. We combine nonparametric kernel density estimation with the Copula function to establish the marginal distributions and joint distributions of yield with SST and SSS in the study area from 2001 to 2019. The distributions are sampled using Monte Carlo simulation to obtain a large amount of yield, SST, and SSS data. On this basis, the pure premium rate of swimming crabs in 24 cities is determined at coverage levels of 70%-100%. The three major results are as follows:
1. An increase in SSS in the northern part of the study area makes a large contribution to the increase in yield, while the yield in the southern region is not significantly affected by SSS. The yield of swimming crabs in most cities increases with SST.
2. The pure premium rate of swimming crabs in each city varies by less than 10% when the coverage level is between 70% and 100%. Furthermore, the amplitude of change in the pure premium rate increases as the coverage level rises.
3. The pure premium rates of swimming crab vary significantly across the 24 cities. The rates are low in the 8 cities south of Taizhou, suggesting that both temperature and salinity pose less risk to yields in these cities. However, there are large differences in pure premium rates among the 16 cities north of Taizhou, where the premium rate is between 6% and 22%.